
********************************************************************************
* Omitted variable bias example in Stata
* Henrique Castro Martins - hcm@iag.puc-rio.br
* If you find any mistake, please let me know
********************************************************************************

* Requires the estout package (eststo, estpost): ssc install estout
import excel "data.xlsx", sheet("Planilha1") firstrow clear

* Short regression in groups A and B (omits risky_firm)
eststo shortreg : reg performance bad_decision

* Long regression in groups A and B (includes risky_firm)
eststo longreg  : reg performance bad_decision risky_firm

* Bias = short coefficient minus long coefficient on bad_decision
di .4453465 - -.3838937
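
* A possible alternative to the hard-coded numbers above: a minimal sketch that
* reads both coefficients from the stored results. It assumes eststo keeps
* shortreg and longreg in Stata's estimates store, so that estimates restore
* can recall them. The scalar names b_short and b_long are illustrative.
quietly estimates restore shortreg
scalar b_short = _b[bad_decision]
* Restore longreg last so that it remains the active estimation result
quietly estimates restore longreg
scalar b_long = _b[bad_decision]
di b_short - b_long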

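* OVB decomposition for the coefficient on bad_decision:
*   short   = long     + bias
*   0.44535 = -0.38389 + bias
*   0.44535 = -0.38389 + 0.82924
*   bias    = phi * omega, where
*     phi   = slope from regressing the omitted variable (risky_firm) on the included one (bad_decision)
*     omega = coefficient on the omitted variable (risky_firm) in the long regression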

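* Omega: coefficient on the omitted variable (risky_firm) in the long regression
* Stored as a scalar rather than as a constant variable in the dataset
scalar omega = _b[risky_firm]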

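* Phi: slope from the auxiliary regression of the omitted variable on the included one
eststo ovb: reg risky_firm bad_decision

* Stored as a scalar rather than as a constant variable in the dataset
scalar phi = _b[bad_decision]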

* Decomposition with the estimated values: 0.44535 = -0.38389 + 1.25146 * 0.66262

* Bias implied by the formula; it should match the short-minus-long difference (0.82924)
di phi * omega
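
* A hedged check of the decomposition (sketch): the short coefficient should
* equal the long coefficient plus phi * omega, up to rounding. It assumes the
* stored results can be recalled with estimates restore. The scalar names
* chk_short and chk_long are illustrative.
quietly estimates restore shortreg
scalar chk_short = _b[bad_decision]
quietly estimates restore longreg
scalar chk_long = _b[bad_decision]
di "short - (long + phi*omega) = " chk_short - (chk_long + phi * omega)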
    
* Summary: means of performance and bad_decision by risky_firm
estpost tabstat performance bad_decision, by(risky_firm)
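
* A possible side-by-side view of the three stored regressions (sketch).
* esttab ships with the same estout package as eststo; the column titles
* below are illustrative.
esttab shortreg longreg ovb, se mtitles("Short" "Long" "Auxiliary")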

